BEV-SGD: Best Effort Voting SGD Against Byzantine Attacks for Analog-Aggregation-Based Federated Learning Over the Air
Abstract
As a promising distributed learning technology, analog-aggregation-based federated learning over the air (FLOA) provides high communication efficiency and privacy provisioning under the edge computing paradigm. When all edge devices (workers) simultaneously upload their local updates to the parameter server (PS) through commonly shared time-frequency resources, the PS obtains only the averaged update rather than the individual ones. While such a concurrent transmission scheme reduces latency costs, it unfortunately renders FLOA vulnerable to Byzantine attacks. Aiming at Byzantine-resilient FLOA, this paper starts from analyzing the channel inversion (CI) mechanism that is widely used for power control in FLOA. Our theoretical analysis indicates that, although CI can achieve good performance in benign scenarios, it offers only limited defensive capability against Byzantine attacks. Then, we propose a novel power control policy called best effort voting (BEV), integrated with stochastic gradient descent (SGD). BEV-SGD enhances the robustness of FLOA against Byzantine attacks by allowing all workers to send their local updates at their maximum transmit power. Under worst-case attacks, we derive the expected convergence rates under the CI and BEV policies, respectively. The rate comparison reveals that our BEV-SGD outperforms its CI-based counterpart in terms of convergence behavior, which is verified by experimental simulations.
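To make the tension the abstract describes concrete, the following Python sketch simulates analog over-the-air aggregation under a toy channel model. Everything here (the fading distribution, noise level, power budget, and the attack) is an illustrative assumption, not the paper's exact system model: honest workers follow either CI or BEV power control while a single Byzantine worker transmits adversarial values at full power.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: M workers transmit analog gradient updates simultaneously,
# so the PS only observes their noisy superposition.
M, DIM, P_MAX = 10, 5, 1.0
h = rng.rayleigh(scale=1.0, size=M)           # hypothetical fading magnitudes
grads = rng.normal(0.0, 1.0, size=(M, DIM))   # honest local gradients
grads[0] = -20.0 * np.ones(DIM)               # worker 0 is Byzantine

def received_sum(power):
    """y = sum_m h_m * sqrt(p_m) * g_m + noise (idealized analog aggregation)."""
    noise = rng.normal(0.0, 0.05, size=DIM)
    return (h[:, None] * np.sqrt(power)[:, None] * grads).sum(axis=0) + noise

# Channel inversion (CI): honest workers pick p_m = eta / h_m^2, where eta is
# bounded by the weakest honest channel so no one exceeds P_MAX. The attacker
# ignores the protocol and transmits at full power.
eta = P_MAX * (h[1:] ** 2).min()
p_ci = eta / h ** 2
p_ci[0] = P_MAX
est_ci = received_sum(p_ci) / (M * np.sqrt(eta))

# Best effort voting (BEV): everyone, honest or not, transmits at P_MAX,
# so the honest majority's combined power dilutes the single attacker.
p_bev = np.full(M, P_MAX)
est_bev = received_sum(p_bev) / (np.sqrt(P_MAX) * h.sum())

honest_mean = grads[1:].mean(axis=0)
print("CI  estimate error:", np.linalg.norm(est_ci - honest_mean))
print("BEV estimate error:", np.linalg.norm(est_bev - honest_mean))
```

The sketch illustrates why CI is fragile: its common scaling factor is bounded by the weakest honest channel, so a full-power attacker dominates the superposition, whereas under BEV the attacker is only one voice among many transmitting at the same power.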
Similar resources
Generalized Byzantine-tolerant SGD
We propose three new robust aggregation rules for distributed synchronous Stochastic Gradient Descent (SGD) under a general Byzantine failure model. The attackers can arbitrarily manipulate the data transferred between the servers and the workers in the parameter server (PS) architecture. We prove the Byzantine resilience properties of these aggregation rules. Empirical analysis shows that the ...
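For context, a classic rule in this family is the coordinate-wise median. The sketch below is a generic illustration of median-based Byzantine tolerance, not necessarily one of the specific rules this paper proposes.

```python
import numpy as np

def coordinate_median(updates):
    """Aggregate worker gradients by taking the median of each coordinate.

    As long as honest workers form a majority, a minority of Byzantine
    workers cannot move any coordinate outside the honest value range.
    """
    return np.median(np.asarray(updates), axis=0)

# Usage: nine honest gradients plus two adversarial ones.
honest = [np.random.normal(0.0, 1.0, 4) for _ in range(9)]
byzantine = [np.full(4, 1e6), np.full(4, -1e6)]
print(coordinate_median(honest + byzantine))  # stays near the honest median
```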
Deep learning with Elastic Averaging SGD
We study the problem of stochastic optimization for deep learning in the parallel computing environment under communication constraints. A new algorithm is proposed in this setting where the communication and coordination of work among concurrent processes (local workers) is based on an elastic force which links the parameter vectors they compute with a center variable stored by the parameter ...
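A minimal sketch of the elastic-averaging idea described above, reduced to a single worker and a center variable (the full algorithm runs many asynchronous workers; `lr` and `rho` are illustrative values):

```python
import numpy as np

def easgd_step(x_local, x_center, grad, lr=0.01, rho=0.1):
    """One simplified elastic-averaging step for a single worker.

    The elastic force rho * (x_local - x_center) pulls the worker toward
    the center variable, while the center drifts toward the worker in turn.
    """
    diff = x_local - x_center
    x_local = x_local - lr * grad - lr * rho * diff
    x_center = x_center + lr * rho * diff
    return x_local, x_center

# Usage with a toy quadratic objective f(x) = 0.5 * ||x||^2 (gradient = x):
x, center = np.ones(3), np.zeros(3)
for _ in range(100):
    x, center = easgd_step(x, center, grad=x)
print(x, center)  # both drift toward the optimum at zero
```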
Beating SGD: Learning SVMs in Sublinear Time
We present an optimization approach for linear SVMs based on a stochastic primal-dual approach, where the primal step is akin to an importance-weighted SGD, and the dual step is a stochastic update on the importance weights. This yields an optimization method with a sublinear dependence on the training set size, and the first method for learning linear SVMs with runtime less the...
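The primal-dual pattern described here can be sketched roughly as follows. This is an illustration under assumed details (uniform initialization, multiplicative dual updates, no importance-sampling correction), not the paper's exact algorithm:

```python
import numpy as np

def sublinear_svm_sketch(X, y, T=1000, lr=0.01, eta=0.1):
    """Rough sketch of the primal-dual pattern: the primal step is a
    hinge-loss SGD step on an example sampled by importance weight; the
    dual step raises the weight of examples that are still poorly
    classified. A faithful estimator would also reweight the gradient
    by 1 / (n * p_i); this is omitted for brevity.
    """
    rng = np.random.default_rng(0)
    n = X.shape[0]
    w = np.zeros(X.shape[1])
    p = np.full(n, 1.0 / n)            # dual: importance weights over examples
    for _ in range(T):
        i = rng.choice(n, p=p)         # primal: importance-weighted sampling
        margin = y[i] * (X[i] @ w)
        if margin < 1.0:
            w += lr * y[i] * X[i]      # hinge-loss subgradient step
        p[i] *= np.exp(eta * max(0.0, 1.0 - margin))
        p /= p.sum()                   # renormalize the distribution
    return w
```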
Musings on Deep Learning: Properties of SGD
We ruminate with a mix of theory and experiments on the optimization and generalization properties of deep convolutional networks trained with Stochastic Gradient Descent in classification tasks. A present perceived puzzle is that deep networks show good predictive performance when overparametrization relative to the number of training data suggests overfitting. We dream an explanation of these...
Faster Asynchronous SGD
Asynchronous distributed stochastic gradient descent methods have trouble converging because of stale gradients. A gradient update sent to a parameter server by a client is stale if the parameters used to calculate that gradient have since been updated on the server. Approaches have been proposed to circumvent this problem that quantify staleness in terms of the number of elapsed updates. In th...
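To make the staleness notion concrete, a common mitigation in this literature (shown purely as an illustration, not necessarily this paper's proposal) is to damp the step size by the number of elapsed updates:

```python
def stale_gradient_step(params, grad, base_lr, staleness):
    """Damp the step size for a gradient that was computed against
    parameters which are `staleness` server updates old.
    """
    lr = base_lr / (1.0 + staleness)
    return [p - lr * g for p, g in zip(params, grad)]

# A gradient delayed by 4 updates takes a fifth of the fresh step size.
print(stale_gradient_step([1.0], [0.5], base_lr=0.1, staleness=4))
```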
Journal
Journal title: IEEE Internet of Things Journal
Year: 2022
ISSN: 2372-2541, 2327-4662
DOI: https://doi.org/10.1109/jiot.2022.3164339